A dataset of global tropical cyclone wind and surface wave measurements from buoy and satellite platforms

There is now a range of potential data sources for wind and surface wave conditions within tropical cyclones. These sources include in situ buoy data and remote sensing data from satellite altimeters, scatterometers, and radiometers. In addition, data providing estimates of tropical cyclone tracks and wind field parameters are available from best track archives. The present dataset brings together this information in a single archive, providing the available data for each tropical cyclone from each of the data sources in a single file. The data consist of observations in a total of 2927 global tropical cyclones over the period from 1985 to 2017. Global statistics of the observations are provided, along with data on the geographic distribution of tropical cyclones within the database.

It should be noted that none of these data sources is homogeneous. The numbers of buoys deployed and satellites in orbit have increased with time, meaning that the observation frequency of TCs has changed over the last 30 years. In addition, the measurement technology has changed. In recent years, more buoys measure full directional spectra, rather than just bulk parameters such as significant wave height and peak period. Similarly, the satellite technology has changed, with higher resolution and more consistent calibration of platforms.
In a series of papers, Tamizi and Young 11 and Tamizi et al. 38,39 combined data from the TC best track database IBTrACS 40 with buoy, altimeter, scatterometer and radiometer data to provide a large and comprehensive investigation of the wind and wave fields in TCs. In these studies, TC tracks were identified globally from the IBTrACS archive. NDBC buoy data and the various remote sensing products were then extracted when the given TC track passed close (550 km or 5°) to the buoy or satellite ground track. This is a large dataset, encompassing a total of 2927 TCs from all tropical cyclone basins. As the data are sourced from a range of public archives, each of which needs to be searched separately to extract the required observations, this is not a simple task. The present database contains these data, stored by year/location/TC name. As such, it is a valuable archive bringing together TC track and wind field parameters with recorded wind and wave data.
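The proximity criterion described above (a TC track passing within 550 km of a buoy or satellite ground track) can be illustrated with a short sketch. This is not the authors' extraction code: the function names, the spherical-Earth haversine distance and the mean Earth radius are assumptions made here for illustration.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # assumed mean radius of a spherical Earth


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


def track_passes_near(track_lat, track_lon, obs_lat, obs_lon, max_km=550.0):
    """True if any track fix lies within max_km of the observation point."""
    d = great_circle_km(np.asarray(track_lat), np.asarray(track_lon),
                        obs_lat, obs_lon)
    return bool(np.any(d <= max_km))
```

For example, `track_passes_near([25.0, 26.0], [-90.0, -89.0], 29.2, -88.2)` tests a two-point hypothetical track against a buoy position in the northern Gulf of Mexico. Testing only the discrete track fixes, rather than the path between them, is a simplification.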

Methods
Each of the data types used in the composite dataset is described below. Note that in the previous applications of these data 11,38,39, track positions were interpolated in time to ensure they were consistent with observations of wave and wind quantities. The present dataset does not interpolate any data. Rather, the original track observations are provided at their original resolution (6 hours in most cases).
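Since the archive stores tracks at their original resolution, users who need track positions at observation times must interpolate themselves. The following is a minimal sketch of one way to do this, assuming linear interpolation in time and longitudes in degrees East. The function name and the choice of linear interpolation are assumptions, not the method used in the earlier studies.

```python
import numpy as np


def interpolate_track(track_time_h, track_lat, track_lon, obs_time_h):
    """Linearly interpolate track positions to observation times.

    Times are hours since a common epoch (increasing); longitudes are in
    degrees East. Longitude is unwrapped first so that a track crossing
    the 0/360 meridian is interpolated correctly. Observation times outside
    the track period are clamped to the first or last fix by np.interp.
    """
    t = np.asarray(track_time_h, dtype=float)
    lat = np.asarray(track_lat, dtype=float)
    lon = np.degrees(np.unwrap(np.radians(np.asarray(track_lon, dtype=float))))
    lat_i = np.interp(obs_time_h, t, lat)
    lon_i = np.mod(np.interp(obs_time_h, t, lon), 360.0)
    return lat_i, lon_i
```

With 6-hourly fixes, `interpolate_track([0, 6, 12], lats, lons, obs_hours)` returns positions at the (hypothetical) observation times `obs_hours`.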

Tropical cyclone tracks and parameters (IBTrACS). The International Best Track Archive for Climate Stewardship (IBTrACS) dataset 40 was developed by the NOAA (National Oceanic and Atmospheric Administration) National Climatic Data Center. The archive synthesizes and merges best-track data from all official Tropical Cyclone Warning Centers and the WMO (World Meteorological Organization) Regional Specialized Meteorological Centers. The dataset includes time, position, maximum sustained winds, minimum central pressure, $p_0$, and storm nature (i.e., tropical cyclone, tropical storm, etc.). In addition, information such as the radius to maximum winds, $R_m$, and the radius to gales, $R_{34}$, is provided for some storms. The data are provided globally at 6-h intervals. Although the archive contains data beginning from 1848, data from before the satellite period are of lower quality. For the present database, only data from 1985 onwards were extracted.
Figure 1 shows the global distribution of tropical cyclone tracks extracted from IBTrACS and contained within the dataset. The tracks shown in Fig. 1 contain data for storms classified as tropical storms, tropical cyclones, and extratropical cyclones. The present database includes all storm types. To exclude extratropical cases, as in Tamizi and Young 11, one can simply disregard data at latitudes higher than 40°.
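The latitude cut-off mentioned above can be applied as a simple mask. The sketch below assumes latitudes in degrees (positive north) and treats the 40° limit as applying in both hemispheres. The function name is hypothetical.

```python
import numpy as np


def tropical_mask(lat_deg, max_abs_lat=40.0):
    """Boolean mask that is True for records at or equatorward of max_abs_lat."""
    return np.abs(np.asarray(lat_deg, dtype=float)) <= max_abs_lat
```

Applying `values[tropical_mask(lat)]` to any co-located array then discards the higher-latitude records.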

NDBC buoy data.
The NDBC operates the longest duration deep water wave buoy network in the world 12. Of particular relevance for the present application, this network covers the Atlantic, Pacific, and Gulf of Mexico regions where North American hurricanes occur. The NDBC buoy data typically include hourly measurements of significant wave height, $H_s$, and other bulk parameters (wave period, etc.), and the one-dimensional energy density spectrum $E(f)$, where $f$ is wave frequency. In addition, for directional buoys, NDBC data contain the cross-spectral moments $a_1$, $a_2$, $b_1$ and $b_2$. The significant wave height can be related to $E(f)$ and the directional energy density spectrum, $E(f, \theta)$, where $\theta$ is wave propagation direction, by

$$H_s = 4\sqrt{\int E(f)\,df} = 4\sqrt{\iint E(f,\theta)\,d\theta\,df}. \tag{1}$$

The directional wave spectrum, $E(f, \theta)$, can be represented as 41,42

$$E(f,\theta) = E(f)\,D(f,\theta), \tag{2}$$

where $D(f,\theta)$ is the directional spreading function, normalized such that $\int D(f,\theta)\,d\theta = 1$. In an approach termed the Fourier Expansion Method (FEM), Longuet-Higgins et al. 41 proposed that $D(f, \theta)$ takes the form

$$D(f,\theta) = A(f)\cos^{2s(f)}\left[\frac{\theta - \theta_m(f)}{2}\right]. \tag{3}$$

The mean direction $\theta_m(f)$ and the spreading parameter $s(f)$ can be determined from the first two spectral moments $a_1(f)$ and $b_1(f)$ as

$$\theta_m(f) = \tan^{-1}\left[\frac{b_1(f)}{a_1(f)}\right], \qquad s(f) = \frac{r_1(f)}{1 - r_1(f)}, \qquad r_1(f) = \sqrt{a_1^2(f) + b_1^2(f)}. \tag{4}$$

The coefficient $A(f)$ is a normalization factor. Although the FEM defined by (2) to (4) is a useful representation of the directional spectrum, the assumed $\cos^{2s}$ form is a significant simplification. More general representations, such as the Maximum Likelihood Method 42,43 and the Maximum Entropy Method 44, can also be defined in terms of the cross-spectral moments.

The NDBC buoys also measure wind speed. As anemometers on these buoys are at a range of heights, measurements need to be converted to a standard reference height, usually 10 m, assuming a neutral stability logarithmic boundary layer 45,46.
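The relations between the spectrum, the significant wave height and the FEM spreading parameters can be sketched numerically as follows. This is an illustrative sketch rather than the processing code behind the archive. The function names are hypothetical, the integration uses a simple trapezoidal rule, the normalization factor $A(f)$ is obtained numerically on a uniform direction grid, and the direction convention (radians, measured as in the `arctan2(b1, a1)` call) is an assumption that may differ from the convention of a given buoy archive.

```python
import numpy as np


def significant_wave_height(f, E):
    """H_s = 4*sqrt(m0), with m0 the trapezoidal integral of E(f) over f."""
    f = np.asarray(f, dtype=float)
    E = np.asarray(E, dtype=float)
    m0 = np.sum(0.5 * (E[1:] + E[:-1]) * np.diff(f))
    return 4.0 * np.sqrt(m0)


def fem_parameters(a1, b1):
    """Mean direction (rad), r1 and spreading parameter s from a1, b1.

    s = r1 / (1 - r1) follows from the first moment of the cos^2s form;
    it diverges as r1 approaches 1 (a unidirectional sea).
    """
    a1 = np.asarray(a1, dtype=float)
    b1 = np.asarray(b1, dtype=float)
    r1 = np.hypot(a1, b1)
    theta_m = np.arctan2(b1, a1)
    s = r1 / (1.0 - r1)
    return theta_m, r1, s


def cos2s_spreading(theta, theta_m, s):
    """cos^2s spreading on a uniform direction grid theta (rad).

    The result is scaled so that its rectangle-rule integral over theta is 1.
    """
    theta = np.asarray(theta, dtype=float)
    shape = np.abs(np.cos(0.5 * (theta - theta_m))) ** (2.0 * s)
    dtheta = theta[1] - theta[0]
    return shape / (np.sum(shape) * dtheta)
```

A quick consistency check is to build $D(\theta)$ for a chosen $s$ and $\theta_m$, compute its first moments numerically, and confirm that `fem_parameters` recovers the same $s$ and $\theta_m$.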
The locations of the NDBC buoys from which TC data are included in the database are shown in Fig. 2. Note that tropical cyclone (hurricane) wave data are also available from the Coastal Data Information Program (CDIP). However, as these data are generally from finite depth locations, they were not included in the present database, which is for deep water.
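The adjustment of buoy anemometer winds to the 10 m reference height, assuming a neutral logarithmic boundary layer, can be sketched as below. The constant roughness length `z0` used here is an assumption chosen purely for illustration. The cited calibration studies may determine the roughness differently (for example from a drag coefficient), so this should not be read as their procedure.

```python
import math


def wind_to_10m(u_z, z, z0=1.0e-4):
    """Adjust wind speed u_z measured at height z (m) to 10 m.

    Uses the neutral log profile U(z) proportional to ln(z / z0), so that
    U10 = u_z * ln(10 / z0) / ln(z / z0). z0 (m) is an assumed constant
    roughness length, not a value taken from the source archive.
    """
    if z <= z0:
        raise ValueError("anemometer height must exceed the roughness length")
    return u_z * math.log(10.0 / z0) / math.log(z / z0)
```

For an anemometer below 10 m the adjusted speed is slightly higher than the measured one. A measurement already at 10 m is returned unchanged.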
The buoy data for the present database were downloaded from the NDBC 47 archive (https://www.ndbc.noaa.gov/). The same data can also be accessed in a more convenient form, as described by Hall and Jensen 48. In the present archive, the quantities $r_1$, $r_2$ and $\theta_m$ were calculated as above. All other quantities are as in the NDBC archive.

Altimeter data. Radar altimeters have been in operation since 1985 and measure wind speed ($U_{10}$) and significant wave height ($H_s$) globally at an along-track resolution of approximately 10 km. As the radar altimeter is a "nadir-looking" instrument, it senses the ocean surface over a narrow beam directly below the satellite. As such, the cross-track resolution is low, with ground tracks separated by hundreds of kilometres. As the geographical extent of TCs is relatively small, altimeter data are not always available for such meteorological systems. Due to the global coverage and temporal extent (1985 to present-day) of the combined altimeter missions, however, there are extensive observations of TC wind and wave fields. Ribal and Young 45 have developed a consistent database of global altimeter data from the 13 altimeters which were operational from 1985 to 2018. This database 49 has now been extended to 15 altimeters and is available at https://portal.aodn.org.au/. Although there is some degradation of altimeter data in high rain regions, a large quality-controlled dataset is available under TC conditions 11. The altimeter database has been calibrated against buoy data and cross-validated at cross-over points between altimeter missions operational at the same time 45.
Scatterometer data. Scatterometers have been in operation globally since 1992 and measure wind speed ($U_{10}$) and direction ($\theta_u$) at a resolution of between 12.5 km and 25 km. In contrast to altimeters, scatterometers measure over a broad swath up to 1400 km wide. As such, they image most locations on the Earth's surface twice per day. Ribal and Young 50 calibrated the various scatterometer missions since 1992 against buoy data. However, these calibrations are limited to wind speeds up to approximately 30 m s$^{-1}$. At higher wind speeds, scatterometers display a low bias due to reduced backscatter signal 51,52. Chou et al. 53 proposed a correction to ASCAT scatterometer winds to address this issue. This approach was extended by Ribal et al. 54, who developed specific correction relationships for the MetOp-A, MetOp-B, ERS-2 and OceanSat-2 scatterometers. For QuikSCAT, they found that no correction is required. These corrections can be applied to the data of Ribal and Young 45 for application in TC conditions. The present database uses the calibrations of Ribal and Young 45.
Figure 3 shows a contour plot of the relative density of scatterometer observations of winds within the present TC database. As one would expect, it closely follows the distribution of TC tracks shown in Fig. 1.
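A relative density field of the kind plotted in Fig. 3, scaled so that its maximum value is 1.0, can be obtained by binning observation positions on a latitude/longitude grid. The sketch below is one possible way to do this. The 2° bin size and the function name are assumptions, since the gridding used for the figure is not stated.

```python
import numpy as np


def relative_density(lat, lon, bin_deg=2.0):
    """Count observations per lat/lon bin and scale so the maximum is 1.0.

    lat is in degrees (positive north); lon may be in either the -180..180
    or 0..360 convention and is mapped to 0..360 degrees East.
    Returns (density, lat_edges, lon_edges).
    """
    lat_edges = np.arange(-90.0, 90.0 + bin_deg, bin_deg)
    lon_edges = np.arange(0.0, 360.0 + bin_deg, bin_deg)
    counts, _, _ = np.histogram2d(lat, np.mod(lon, 360.0),
                                  bins=[lat_edges, lon_edges])
    peak = counts.max()
    density = counts / peak if peak > 0 else counts
    return density, lat_edges, lon_edges
```

The returned array can be passed directly to a contouring routine together with the bin centres derived from the edges.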
Radiometer data. In a similar fashion to scatterometers, radiometers measure over a broad ground track swath with a spatial resolution of 25 km. The radiometer dataset 55 is extensive, commencing in 1986. However, radiometer returns are significantly degraded by heavy rain. As such, radiometers can generally only provide reliable data in TC conditions for the periphery of storms.
Composite data records for tropical cyclones. As noted above, these combined datasets provide detailed observations of wind and wave properties under TC conditions. A typical example of the composite data is shown in Fig. 4, which shows the track and the available buoy, altimeter and scatterometer data for Hurricane Katrina in the Caribbean and Gulf of Mexico during 2005. The data were extracted from database file 2005236N23285_KATRINA.nc. The figure shows the broad distribution of buoy data available from the NDBC archive, the extensive altimeter tracks across the track of the hurricane and the broad swaths of scatterometer data. The in situ data at Buoy 42040 show $H_s$ peaking at approximately 17 m and $U_{10}$ at 28 m s$^{-1}$.
Figure 5 shows histograms of wind, wave and TC parameters for data associated with the buoy observations in the database. More details are provided in Tamizi and Young 11. The database of buoy observations consists of more than 2900 time series of wind speed/wave height from more than 350 TCs. The histograms show $H_s$ up to 18 m and $U_{10}$ up to 60 m s$^{-1}$. These values were recorded during the passage of TCs (hurricanes) with central pressure, $p_0$, down to 880 hPa and values of velocity of forward movement, $V_{fm}$, up to 30 m s$^{-1}$. The data were taken within 10 times the radius to maximum wind, $R_m$, of the TC centre, with the TCs having values of radius to gales, $R_{34}$, up to 420 km.
The corresponding distributions of observed wind and wave data from the altimeter observations are shown in Fig. 6. As can be seen, the number of altimeter observations is far larger than for the buoy data. This is due to the high along-track spatial coverage of the altimeter, together with the fact that it is a global dataset, whereas the buoy data are confined to North America. The full dataset contains more than 36,000 altimeter passes over TC wind/wave fields from more than 2,730 TCs. The distributions of the various parameters are similar to the buoy dataset. However, it is noticeable that the maximum recorded values are $H_s \approx 16$ m and $U_{10} \approx 50$ m s$^{-1}$. These values are both lower than the corresponding values from the buoy observations, despite the fact that the altimeter dataset contains many more observations from a larger set of TCs. This suggests that the altimeter database misses some of the extreme observations within the storms. This interpretation is supported by comparisons of the distributions of the values of observation distance/$R_m$. For the altimeter dataset, there are few observations near the centres of the TCs. This is because the QA process rejected many observations where the quality of the altimeter returns had been degraded by high rain rates near the centres of storms.
The corresponding distributions of wind speed from the scatterometer observations are not shown here due to space limitations but are similar to those shown in Figs. 5 and 6. This dataset consists of more than 13,500 scatterometer passes through more than 800 TCs. Due to the broad ground track swaths of scatterometers, there are more than 14,000,000 observations.

Data Records
The dataset is stored as NetCDF files, with one file per TC (a total of 2927 files). The file names largely follow the IBTrACS naming convention, such that each name acts as a TC identifier. The general format is yyyydddHaabbb_TCName.nc, where yyyy is the year and ddd the day of the year on which the data commence, H is the hemisphere (N or S), aa is the latitude (degrees), bbb is the longitude (degrees East) and TCName is the name of the TC.
As an example, "2005236N23285_KATRINA.nc" contains the data for Hurricane Katrina, with the data commencing on day 236 of year 2005 at latitude 23°N and longitude 285°E. The data stored in each file are described in Tables 1, 2 and 3, with Hurricane Katrina used as an example.
The dataset 56 can be downloaded from https://doi.org/10.26188/24471688. To provide users with an example of how to access and use the NetCDF files, the Matlab script used to produce Fig. 4 can be downloaded from https://doi.org/10.26188/24903117.
Tables 1, 2 and 3 below contain details of the variable names, definitions of the measured quantities and the dimensions of the storage arrays within the NetCDF files of the TC database.

Fig. 6 Summary of the altimeter TC database: (a) maximum significant wave height $H_s$ for each pass of an altimeter, (b) maximum wind speed $U_{10}$ for each pass of an altimeter, (c) central pressure $p_0$ of each storm in the database, (d) velocity of forward movement $V_{fm}$ of each storm in the database, (e) radius to gales $R_{34}$ of each storm in the database, and (f) minimum distance between altimeter and storm eye for each altimeter pass.
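Because each file name encodes the start date and position of the storm, it can be useful to parse it when searching the archive. The sketch below is based on the pattern implied by the Katrina example (year, day of year, hemisphere letter, two-digit latitude, three-digit longitude in degrees East, then the name). The interpretation of the hemisphere letter, the class and function names, and the regular expression are assumptions. Since the file names only "largely" follow the convention, names that do not match raise an error rather than being guessed at.

```python
import re
from typing import NamedTuple


class TCFileId(NamedTuple):
    year: int
    day_of_year: int
    latitude: int    # degrees, positive north (sign inferred from N/S letter)
    longitude: int   # degrees East
    name: str


# Assumed pattern: yyyy ddd H aa bbb _ TCName .nc
_PATTERN = re.compile(r"^(\d{4})(\d{3})([NS])(\d{2})(\d{3})_(.+)\.nc$")


def parse_tc_filename(filename):
    """Split a TC file name into its date, position and name components."""
    m = _PATTERN.match(filename)
    if m is None:
        raise ValueError(f"unrecognised TC file name: {filename!r}")
    year, doy, hemi, lat, lon, name = m.groups()
    sign = 1 if hemi == "N" else -1
    return TCFileId(int(year), int(doy), sign * int(lat), int(lon), name)
```

For instance, `parse_tc_filename("2005236N23285_KATRINA.nc")` yields the year 2005, day 236, latitude 23, longitude 285 and the name `KATRINA`, matching the description of that file in the text.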

Technical Validation
The instruments used to compile the present database are all extensively used for metocean applications. However, the extreme conditions in TCs pose issues for all instrumentation systems. Below, calibration and validation studies of these systems under TC conditions are considered.

Buoys.
Surface floating buoys have a history of more than 50 years of usage and form the basis for national wave measuring programs around the world. These systems use either acceleration or GPS principles to measure the time series of surface displacement (note that NDBC data use acceleration). These systems have been extensively calibrated and validated and represent mature technology.
Anemometers (both cup and sonic) mounted on meteorological buoys also provide the mainstay of metocean wind measurements. At high winds and large sea states, however, concerns have been raised about tilting of the buoy axis and shadowing by large waves 11,22–24.

Altimeters. The altimeter wind speed and wave height data used in the database are obtained from the AODN archive 45,55. The altimeter data in the archive were calibrated and validated against extensive buoy data 45. These data indicate no decrease in the accuracy of measurements of significant wave height up to 10 m. Beyond this value, there are almost no co-located data with which to make an assessment. Similar calibrations of wind speed against buoy data are used for wind speeds up to approximately 25 m s$^{-1}$. For higher wind speeds, data suggest that the radar cross-section of the altimeter signal is less sensitive to wind speed, and a correction to these calibrations is applied at higher wind speeds 57. These altimeter calibration relations for wind speed have subsequently been validated against scatterometer and radiometer data 58.
Scatterometer. The scatterometer data used in the present database were also sourced from the AODN archive. The scatterometer values of wind speed were calibrated and validated against global datasets 50. Ribal et al. 54 subsequently proposed a high wind speed correction for TC conditions. The present dataset does not apply this high wind speed correction. However, it is a very simple process to correct the present data before application, and this is recommended to users.

Fig. 1 Tracks of tropical cyclones extracted from IBTrACS 60 and contained within the database. For clarity, only every 4th track is shown. (Figure created with Matlab R2023a, mathworks.com).

Fig. 2 Locations of NDBC buoys for which tropical cyclone in situ wind speed and wave data are included in the database. (Figure created with Matlab R2023a, mathworks.com).

Fig. 3 Contour plot of the relative density (maximum value 1.0) of scatterometer observations within the TC database. (Figure created with Matlab R2023a, mathworks.com).

Fig. 4 Observations of wind and wave conditions during Hurricane Katrina in 2005. (a) TC track in blue, buoy locations shown with red dots and altimeter observations by linear tracks. The location of NDBC Buoy 42040 is shown. (b) Ground track swaths of scatterometer wind data. (c) Significant wave height ($H_s$) as a function of time from Buoy 42040. (d) Wind speed ($U_{10}$) as a function of time from Buoy 42040. (Figure created with Matlab R2023a, mathworks.com).

Fig. 5 Summary of data from the in situ buoy TC database: (a) $H_s$ for each transect of a TC, (b) $U_{10}$ for each transect, (c) $p_0$ of each storm in the database, (d) $V_{fm}$ of each storm in the database, (e) $R_{34}$ of each storm in the database, and (f) minimum distance between TC eye and buoy for each case in the database.

Table 1. Dimensions of variables in NetCDF file. Example shown for Hurricane Katrina.

Table 2. Variables related to IBTrACS and satellite data in NetCDF files. Example shown for Hurricane Katrina.

Table 3. Variables related to buoy data in NetCDF files. Example shown for Hurricane Katrina. Three values (time, Hs, U10) defined by col. 2. Data values for MET_Buoys_List in column three. Fill data: -999.